A Comprehensive Roman (English)-to-Bangla Transliteration Scheme
Abstract
A transliteration scheme from Roman (English) to Bangla can help increase the use of Bangla in essential and diverse computing areas such as word processing, Internet and mobile communication, and information query and retrieval. The Bangla script's irregular phonetic nature and its large repertoire of consonant clusters (juktakkhors) create a wide gap between the pronunciation and the orthography of a given Bangla word. In this paper, we describe a comprehensive Roman (English)-to-Bangla transliteration scheme that is designed to handle the full complexity of the Bangla script. We apply a phonetic encoding scheme to produce intermediate code-strings that make it easier to match the pronunciation of an input string with the desired output. We also provide graceful degradation to a more conventional direct phonetic mapping in special circumstances. A prototype of our scheme shows significant success in test cases.
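The two-stage idea in the abstract (match on a phonetic code first, then fall back to direct mapping) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the encoding rules, the two-word lexicon, the character tables, and the function names `phonetic_code`, `direct_map`, and `transliterate` are all hypothetical.

```python
# Hypothetical sketch only: rules, tables, and names are illustrative
# assumptions, not the encoding scheme described in the paper.

# Toy rules that collapse Roman spellings which sound alike.
ENCODING_RULES = [("sh", "s"), ("ph", "f")]


def phonetic_code(roman: str) -> str:
    """Reduce a Roman input string to a crude intermediate code-string."""
    code = roman.lower()
    for source, target in ENCODING_RULES:
        code = code.replace(source, target)
    # Collapse runs of the same letter, so "ss" and "s" give the same code.
    collapsed = []
    for ch in code:
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


# Toy lexicon keyed by phonetic code. The second entry contains a
# consonant cluster whose spelling differs from its pronunciation.
LEXICON = {
    phonetic_code("ami"): "আমি",
    phonetic_code("bissho"): "বিশ্ব",
}

# Fallback tables: consonants, and vowels as (independent form, vowel sign).
CONSONANTS = {"k": "ক", "m": "ম", "b": "ব", "l": "ল", "n": "ন", "r": "র"}
VOWELS = {"a": ("আ", "া"), "i": ("ই", "ি"), "u": ("উ", "ু")}


def direct_map(roman: str) -> str:
    """Letter-by-letter mapping; ignores clusters and inherent vowels."""
    out = []
    after_consonant = False
    for ch in roman.lower():
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            after_consonant = True
        elif ch in VOWELS:
            independent, sign = VOWELS[ch]
            out.append(sign if after_consonant else independent)
            after_consonant = False
        else:
            out.append(ch)  # pass unknown characters through unchanged
            after_consonant = False
    return "".join(out)


def transliterate(roman: str) -> str:
    """Try the phonetic-code lookup first, then degrade to direct mapping."""
    return LEXICON.get(phonetic_code(roman)) or direct_map(roman)


if __name__ == "__main__":
    for word in ["bissho", "bishsho", "bisho", "aami", "mala"]:
        print(word, "->", transliterate(word))
```

In this sketch, several Roman spellings of the same pronunciation share one code and therefore reach the same lexicon entry, while a word missing from the lexicon is still rendered by the simpler direct mapping.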
Similar resources
Phonetic Bengali Input Method for Computer and Mobile Devices
Current mobile devices do not support a Bangla (or Bengali) input method. Because of this, many Bangla speakers have to write Bangla on mobile phones using the English alphabet. When doing so, they tend to write English loanwords using English spelling. This tendency also exists when writing on a computer with phonetic input methods, which causes many typing mistakes. In this scenario, co...
How to Translate Unknown Words for English to Bangla Machine Translation Using Transliteration
Due to the small available English-Bangla parallel corpus, an Example-Based Machine Translation (EBMT) system has a high probability of handling unknown words. To improve translation quality for the Bangla language, we propose a novel approach for EBMT using WordNet and International Phonetic Alphabet (IPA)-based transliteration. The proposed system first tries to find semantically related English words from Wor...
English to Bangla Phrase-Based Machine Translation
Machine Translation (MT) is the task of automatically translating a text from one language to another. In this work we describe a phrase-based Statistical Machine Translation (SMT) system that translates English sentences to Bangla. A transliteration module is added to handle out-of-vocabulary (OOV) words. This is especially useful for low-density languages like Bangla for which only a limited a...
Romanized Language Identification and Transliteration System for Security with an Authentication System Using Persuasive Cued Click Points - RLITS
Romanized script is popular today for communication in every country, as the script is almost universally enabled in text processors. In countries like India, which is a linguistic cauldron, it is very common to see English text in email messages and chat transcripts, with a generous sprinkling of words from local languages in Roman script. Dubbed as Manglish (Malayalam and English) etc., this rom...
Resource Creation for Training and Testing of Transliteration Systems for Indian Languages
Machine transliteration is used in a number of NLP applications ranging from machine translation and information retrieval to input mechanisms for non-Roman scripts. Many popular Input Method Editors for Indian languages, like Baraha, Akshara, Quillpad, etc., use back-transliteration as a mechanism to allow users to input text in a number of Indian languages. The lack of a standard dataset to eval...